Advanced Metrics for Class-Driven Similarity Search

Authors

  • Paolo Avesani
  • Enrico Blanzieri
  • Francesco Ricci
Abstract

This paper presents two metrics for the Nearest Neighbor Classifier that share the property of being adapted, i.e. learned, on a set of data. Both metrics can be used for similarity search when the retrieval critically depends on a symbolic target feature. The first one, called Local Asymmetrically Weighted Similarity Metric (LASM), exploits reinforcement learning techniques for the computation of asymmetric weights. Experiments on benchmark datasets show that LASM maintains good accuracy and achieves high compression rates, outperforming competing editing techniques such as Condensed Nearest Neighbor. From a completely different perspective, the second metric, called Minimum Risk Metric (MRM), is based on probability estimates. MRM can be implemented using different probability estimates and performs comparably to the Bayes classifier based on the same estimates. Both LASM and MRM outperform the NN classifier with the Euclidean metric.
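The abstract does not state the MRM formula, so the following Python sketch only illustrates the general idea of a nearest-neighbor dissimilarity built from class-probability estimates. It assumes one plausible form, in which the dissimilarity between a query x and a stored case y is the estimated probability that they carry different labels, sum over classes c of p(c|x)(1 - p(c|y)). The function names, the Gaussian-kernel probability estimate, and the `bandwidth` parameter are all hypothetical choices made for illustration, not taken from the paper.

```python
import numpy as np


def class_posteriors(x, X_train, y_train, n_classes, bandwidth=1.0):
    """Hypothetical Gaussian-kernel estimate of p(c | x) from labeled data."""
    sq_dist = np.sum((X_train - x) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    # Sum the kernel weights separately for each class label.
    per_class = np.bincount(y_train, weights=weights, minlength=n_classes)
    total = per_class.sum()
    if total == 0.0:
        # All weights underflowed: fall back to a uniform estimate.
        return np.full(n_classes, 1.0 / n_classes)
    return per_class / total


def risk_dissimilarity(p_x, p_y):
    """Assumed form: estimated probability that x and y differ in class."""
    return float(np.sum(p_x * (1.0 - p_y)))


def classify(x, X_train, y_train, n_classes, bandwidth=1.0):
    """Return the label of the stored case with the lowest estimated risk."""
    p_x = class_posteriors(x, X_train, y_train, n_classes, bandwidth)
    risks = [
        risk_dissimilarity(
            p_x, class_posteriors(y, X_train, y_train, n_classes, bandwidth)
        )
        for y in X_train
    ]
    return int(y_train[int(np.argmin(risks))])


# Toy data (made up for this sketch): two well-separated classes.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
y_train = np.array([0, 0, 1, 1])
label_a = classify(np.array([0.2, 0.3]), X_train, y_train, n_classes=2)
label_b = classify(np.array([5.1, 5.4]), X_train, y_train, n_classes=2)
```

Any other probability estimator could replace `class_posteriors` here, which is in the spirit of the abstract's remark that MRM can be implemented using different probability estimates.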


Similar resources

Identification of BKCa channel openers by molecular field alignment and patent data-driven analysis

In this work, we present the first comprehensive molecular field analysis of patent structures on how the chemical structure of drugs impacts the biological binding. This task was formulated as searching for drug structures to reveal shared effects of substitutions across a common scaffold and the chemical features that may be responsible. We used the SureChEMBL patent database, which prov...

Review of ranked-based and unranked-based metrics for determining the effectiveness of search engines

Purpose: Many metrics have traditionally been used to evaluate search engines, and researchers have proposed new ones in recent years. Awareness of these new metrics is essential for conducting research on search engine evaluation. The purpose of this study was therefore to provide an analysis of important and new metrics for evaluating search engines. Methodology: This is ...

Yencken, Lars and Timothy Baldwin (2008) Orthographic similarity search for dictionary lookup of Japanese words, In Proceedings of the 18th European Conference on Artificial Intelligence (ECAI-08), Patras, Greece

Finding an unknown Japanese word in a dictionary is a difficult and slow task when one or more of the word’s characters is unknown. For advanced learners, unknown characters evoke the form and meaning of visually similar characters they are familiar with. We propose a range of distance metrics for characters to allow learners to leverage known characters to search for words containing unknown b...

Orthographic similarity search for dictionary lookup of Japanese words

Finding an unknown Japanese word in a dictionary is a difficult and slow task when one or more of the word’s characters is unknown. For advanced learners, unknown characters evoke the form and meaning of visually similar characters they are familiar with. We propose a range of character distance metrics to allow learners to leverage known characters to search for words containing unknown but vi...

On the Foundations of Data Interoperability and Semantic Search on the Web

Title of Document: On the Foundations of Data Interoperability and Semantic Search on the Web. Hamid Haidarian Shahri, Doctor of Philosophy, 2011. Directed by: Professor Donald Perlis, Department of Computer Science. This dissertation studies the problem of facilitating semantic search across disparate ontologies that are developed by different organizations. There is tremendous potential in enabli...


Publication date: 1999